Multiword Expression-Aware A* TAG Parsing Revisited
Authors
Abstract
A* algorithms enable efficient parsing with large grammars and/or complex syntactic formalisms. Moreover, promoting multiword expressions (MWEs) has been shown to be a beneficial strategy for dealing with syntactic ambiguity. The state-of-the-art A* heuristic for promoting MWEs in tree-adjoining grammar (TAG) parsing has two drawbacks: it is not monotonic, and it composes poorly with grammar compression techniques. In this work, we propose an enhanced version of this heuristic that addresses both shortcomings.
Similar resources
English Multiword Expression-aware Dependency Parsing Including Named Entities
Because syntactic structures and spans of multiword expressions (MWEs) are independently annotated in many English syntactic corpora, they are generally inconsistent with respect to one another, which is harmful to the implementation of an aggregate system. In this work, we construct a corpus that ensures consistency between dependency structures and MWEs, including named entities. Further, we ...
Discriminative Strategies to Integrate Multiword Expression Recognition and Parsing
The integration of multiword expressions in a parsing procedure has been shown to improve accuracy in an artificial context where such expressions have been perfectly pre-identified. This paper evaluates two empirical strategies to integrate multiword units in a real constituency parsing context and shows that the results are not as promising as has sometimes been suggested. Firstly, we show th...
Parsing Models for Identifying Multiword Expressions
Multiword expressions lie at the syntax/semantics interface and have motivated alternative theories of syntax like Construction Grammar. Until now, however, syntactic analysis and multiword expression identification have been modeled separately in natural language processing. We develop two structured prediction models for joint parsing and multiword expression identification. The first is base...
Projecting Multiword Expression Resources on a Polish Treebank
Multiword expressions (MWEs) are linguistic objects containing two or more words and showing idiosyncratic behavior at different levels. Treebanks with annotated MWEs enable studies of such properties, as well as training and evaluation of MWE-aware parsers. However, few treebanks contain full-fledged MWE annotations. We show how this gap can be bridged in Polish by projecting 3 MWE resources o...
Promoting multiword expressions in A* TAG parsing
Multiword expressions (MWEs) are pervasive in natural languages and often have both idiomatic and compositional readings, which leads to high syntactic ambiguity. We show that for some MWE types idiomatic readings are usually the correct ones. We propose a heuristic for an A* parser for Tree Adjoining Grammars which benefits from this knowledge by promoting MWE-oriented analyses. This strategy l...